ABSTRACT

This paper presents a simulation and comparison of two different infrared (IR) imaging systems in terms
of their use in automotive collision avoidance and vision enhancement applications. The first half of this
study concerns the simulation of a "cooled" shortwave focal plane array (FPA) infrared imaging system and an
"uncooled" focal plane array infrared imaging system. This is done using the United States Army's
Tank-Automotive Research, Development and Engineering Center's (TARDEC) Thermal Image Model
(TTIM). Visual images of automobiles as seen through a forward-looking infrared sensor are generated
with TTIM under a variety of viewing range and rain conditions. The second half of the study
focuses on a comparison between the two simulated sensors. This comparison is undertaken from the
standpoint of the ability of a human observer to detect potential (collision) targets when looking through
the two different sensors. A measure of the target's detectability is derived for each sensor by using the
TARDEC Visual Model (TVM). The authors found the uncooled pyroelectric FPA to give excellent
imagery. Combined with the advantages of the 7.5-13.5 µm band in the atmosphere and the higher
blackbody exitance in that band, the 7.5-13.5 µm uncooled sensor is therefore the better choice for
imaging through numerous atmospheric conditions compared to the 3.4-5.5 µm cooled sensor.
2. INTRODUCTION

Collision avoidance and vision enhancement systems are seen as an integral part of the next
generation of active automotive safety devices [1, 2]. Automotive manufacturers are evaluating a variety of
imaging sensors for their usefulness in such systems [1]. One potential application for automobiles is a
driver's vision-enhancement system [11]. This use of night vision sensors as a safety feature would allow
drivers to see objects at a distance of about 1500 ft., far beyond the range of headlights. Obstacles in the
driver's peripheral field of view could be seen and recognized much sooner. Sensors that operate at
wavelengths close to the electromagnetic frequency band of human vision (such as video cameras) provide
images that have good spatial resolution. However, the quality of the images (in terms of relative contrast
and spatial resolution) acquired by such a camera degrades drastically under conditions of poor light, rain,
fog, smoke, etc. One way to overcome such poor conditions is to choose an imaging sensor that operates
at longer (than visual) wavelengths. The relative contrast in images acquired from such sensors does not
degrade as drastically under poor visibility conditions. However, this characteristic comes at a cost: the
spatial resolution of the image provided by such sensors is less than that provided by a video camera.

Passive infrared sensors operate at wavelengths slightly longer than the visual spectrum. (The
visual spectrum is between 0.4 and 0.7 microns, and the commonly used portions of the infrared spectrum
are in the atmospheric "windows" that reside between 0.7 and 14 microns.) Hence IR sensors perform
better than a video camera (in terms of relative contrast) when the visibility conditions are poor. Also,
since their wavelength of operation is only slightly longer, the quality of the image provided by an
infrared sensor is comparable to that of a video camera (in terms of spatial resolution). As a result,
infrared sensors have much potential for use in automotive collision avoidance systems [1, 3].

Of the different types of infrared detector technologies, two state-of-the-art infrared
detectors are considered in this paper. Both offer beneficial alternatives for an infrared sensor
system in automotive and surveillance applications. The first alternative is based on a cooled focal plane
array (FPA) of CMOS PtSi infrared detectors that operate in the 3.4-5.5 µm wavelength band. The
second alternative is based on a staring uncooled barium strontium titanate (BST) FPA of ceramic sensors
that operate in the 7.5-13.5 micron wavelength band. Under clear atmospheric conditions and at ranges
less than 500 meters the 3.4-5.5 micron systems generate images with less contrast than the 7.5-13.5
micron system. Dual-band field data show that the 3.4-5.5 band systems present more contrast between
temperature extremes whereas the 7.5-13.5 band systems show more detail in the overall picture.

The TACOM Thermal Image Model (TTIM) is a computer model that simulates the appearance
of a thermal scene as seen through an IR imaging system [6]. TTIM can simulate the sampling effects of
the older single-detector scanning systems, as well as more modern systems that use focal plane staring
arrays. TTIM can also model image intensifiers. A typical TTIM simulation incorporates the image
degrading effects of several possible atmospheric conditions by using LOWTRAN, a computer model of
the effects of atmospheric conditions on thermal radiation that was developed at the United States Air
Force's Geophysics Laboratory. A particularly attractive feature of TTIM is that it produces a simulated
image for the viewer, not a set of numbers as some of the other simulations do. We refer the reader to Fig.
1 for a schematic representation of TTIM.


Figure 1: Schematic representation of TTIM

In the first half of this paper we use TTIM to simulate the cooled and uncooled infrared (IR)
imaging systems, and compare their performance from the standpoint of automotive applications.
Analogous comparisons exist in the current literature [4, 5]. However, it is our opinion that such studies
are not applicable to the situation at hand. TTIM and TVM together allow the performance of the two
IR systems to be compared in terms of how well their images support subsequent
human perception and interpretation. The existing studies do not allow such comparisons.


The comparison of system performance leads us to the second half of this paper. Given that we
have two images of the same scene, captured by using the two different infrared systems, we use TVM to
assess which of the two is "better". TVM is a computational model of the human visual system [7].
Within the functional area of signature analysis, the unclassified model consists of two parts: early human
vision modeling and signal detection. The early vision part of the model is itself made up of two basic
parts: the first is a color separation module, and the second is a spatial frequency decomposition
module. The color separation module is akin to the human visual system. The spatial frequency
decomposition module is based on a Gaussian-Laplacian pyramid framework. Such pyramids are special
cases of wavelet pyramids, and they represent a reasonable model of spatio-frequency channels in early
human vision [8]. We refer the reader to Fig. 2 for a schematic representation of TVM.


Figure 2: Schematic representation of TVM

3. SIMULATION OF INFRA-RED SENSORS

This section presents the simulation of cooled and uncooled infrared imaging systems using
TTIM. Specifically, we use as input to TTIM actual thermal images of commercial vehicles in a typical
road scene and then resample the image using TTIM. The initial infrared images were taken at TARDEC
with the pyroelectric sensor from Texas Instruments (TI). We present examples of how rain affects
the quality of the image displayed by the sensor. Throughout this paper "target" shall be synonymous with the
"object-of-interest" in the scene and "no-target" shall mean the image with the "object-of-interest"
removed.


We see this type of simulation as a substantial first step, and as providing a means to
comprehensively evaluate and compare the sensor systems for commercial use in the future. Our ability to
simulate the sensors provides a means for exactly repeating imaging experiments and measurements,
something that is difficult to achieve in field trials. In our experience, simulating the sensors also
gives us precise control over the imaging conditions. In
cooled infrared systems, for example, it is important to provide proper temperature shielding during field
trials. Otherwise, the quality of the images acquired from the infrared system is badly affected, which
undermines the validity of subsequent comparisons between sensor systems. By simulating cooled
infrared systems we can overcome such difficulties.


In Figure 6 we present simulated infrared images of typical commercial vehicles when the
viewing distance (the distance between the vehicle and the sensor) is fixed, and the amount of rainfall
under which the image is acquired increases. This is done for both the cooled and uncooled cases by
inputting into TTIM the thermal image containing the target and the corresponding no-target image. The
images have been resampled according to the specific sensor and then degraded by rain and fog.

4.  SENSOR  COMPARISON 

In this section we use TVM to compare the quality of images acquired from the cooled and the
uncooled infrared imaging systems through rain and fog. Then, using TVM, we obtain the SNR and a
measure of detectability, d', in each of the images for a vehicle of interest. Specifically, we input into
TVM the target and no-target images corresponding to each infrared system.

The  TVM  Signature  Vector 

Sampling of image contrast by the human visual system is represented by a series of Gaussian
filters that render approximate derivatives of the contrast gradient over space [9]. Each filter performs a
spatial frequency bandpass operation in one frontal-plane dimension, and low-pass filters the orthogonal
dimension. In the case of TVM, these spatial filters are implemented sequentially with a simple five-pixel
kernel in each of two orthogonal frontal-plane directions. A set of seven bandpass filters centered at
different spatial frequencies and differing by one octave is implemented as a pyramidal hierarchy of filters
in which the image input to the next lower filter is obtained as a residual of the operation of the next
higher one. TVM applies these seven bandpass filters across three color-opponent channels, each of
which is divided into two orientations, giving 42 channel outputs in all. Each of these channel outputs
contains a constituent of the original image that represents a different component of the human visual
system's target detection mechanism.
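
The filter bank described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the TVM implementation: the binomial weights of the five-tap kernel, the border handling, the decimation by two between octaves, and the function names (`oriented_pyramid`, `channel_outputs`) are all hypothetical choices made here. The sketch shows how seven octaves, three planes and two orientations yield 42 channel outputs.

```python
# Five-tap binomial low-pass kernel. The source only says "five-pixel kernel";
# these particular weights are an assumption.
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def blur_rows(img):
    """Low-pass every row with KERNEL, replicating pixels at the borders."""
    width = len(img[0])
    return [
        [
            sum(k * row[min(max(x + i - 2, 0), width - 1)]
                for i, k in enumerate(KERNEL))
            for x in range(width)
        ]
        for row in img
    ]


def transpose(img):
    return [list(col) for col in zip(*img)]


def blur_cols(img):
    """Low-pass every column by transposing, filtering rows, transposing back."""
    return transpose(blur_rows(transpose(img)))


def subtract(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def oriented_pyramid(img, levels=7):
    """Return one (band_x, band_y) pair per octave.

    band_x is band-pass along x and low-pass along y; band_y is the reverse.
    The doubly low-passed residual, decimated by two, feeds the next level.
    """
    bands = []
    current = img
    for _ in range(levels):
        low_x = blur_rows(current)
        low_y = blur_cols(current)
        low_xy = blur_cols(low_x)
        bands.append((subtract(low_y, low_xy), subtract(low_x, low_xy)))
        current = [row[::2] for row in low_xy[::2]]
    return bands


def channel_outputs(color_planes, levels=7):
    """Flatten the pyramid of every colour-opponent plane into one channel list."""
    outputs = []
    for plane in color_planes:
        for band_x, band_y in oriented_pyramid(plane, levels):
            outputs.extend([band_x, band_y])
    return outputs


if __name__ == "__main__":
    # Three hypothetical colour-opponent planes of a 64 x 64 test image.
    planes = [[[float((x * y + c) % 7) for x in range(64)] for y in range(64)]
              for c in range(3)]
    print(len(channel_outputs(planes)))  # 7 octaves x 3 planes x 2 orientations
```

Subtracting the doubly smoothed image from the singly smoothed one keeps the example separable, so each band needs only one-dimensional passes of the same kernel.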

In a manner similar to standard amplitude modulation signal detection, TVM obtains the contrast
modulation energy (CME) of a single channel by squaring its amplitude-modulated output, then low-pass
filtering the result to obtain an energy-envelope function, as illustrated in Figure 3. TVM iterates this
process across the preselected target and background areas within a single channel to obtain the averages
and the variances of a channel's target and background CMEs. The difference between a channel's
average target and background CME provides one metric of a channel's contribution to target detection,
and the difference between target and background CME variances provides a second one, as shown in
equation (1). A single signature metric of channel output which combines these measures is defined by
equations (1), (2) and (3). These equations define the necessary parameters for a single-channel SNR
assessment. The noise term includes noise internal to the eye, which is a function of illumination level,
and a clutter noise term, which is estimated from the CME background statistics.
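
The square-then-low-pass step and the target/background statistics can be illustrated on a one-dimensional channel output. The sketch below is a simplified stand-in under loud assumptions: a moving-average filter plays the role of the low-pass stage, the names `cme_envelope` and `cme_statistics` are hypothetical, and the way TVM combines the two differences into a single-channel SNR is not reproduced here.

```python
import math


def box_lowpass(signal, radius):
    """Moving-average low-pass filter with edge replication (an assumed filter)."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def cme_envelope(channel, radius=4):
    """Square the amplitude-modulated output, then low-pass it."""
    return box_lowpass([v * v for v in channel], radius)


def mean_var(values):
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def cme_statistics(channel, target_idx, background_idx, radius=4):
    """Differences of CME means and variances between target and background."""
    env = cme_envelope(channel, radius)
    t_mean, t_var = mean_var([env[i] for i in target_idx])
    b_mean, b_var = mean_var([env[i] for i in background_idx])
    return {"mean_diff": t_mean - b_mean, "var_diff": t_var - b_var}


if __name__ == "__main__":
    # Synthetic channel: weak modulation in the background, strong on the target.
    channel = [(1.0 if 32 <= i < 64 else 0.2) * math.sin(2 * math.pi * i / 8)
               for i in range(96)]
    stats = cme_statistics(channel, range(40, 57), range(0, 21))
    print(stats)
```

A larger `mean_diff` means the target region carries more contrast energy in that channel than the background does.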


Figure 3. Extracting the energy envelopes of spatial bandpass filters. Prior to their combination
according to TVM rules, the amplitude-modulated outputs within each of 42 spatial bandpass channels are
squared and then low-pass filtered to obtain their contrast modulation energy (CME) envelopes.

The TVM signature vector comprises the 42 single-channel SNR estimates, each of which is
weighted in proportion to the relative number of receptive fields of each bandpass type found in human
vision. TVM aggregates these weighted SNR estimates according to the cortical pooling model of Watson
[9] in equation (4), which specifies that the density of receptive fields of any spatial bandpass type on the
retinal surface is an inverse function of retinal eccentricity from the fovea. The resulting quantity d in
equation (5) is the detectability metric derived by TVM. The exponent QSNR in equation (5) is
approximately 2, corresponding to an ideal observer model for signal detection theory. As expressed in
equation (6), d has a log-linear relationship to d', the TVM output parameter. The parameter d' specifies a
human receiver operating characteristic (ROC) curve from which detectability in terms of p(hit) can be
predicted as a function of a given p(fa), and/or a function of an observer's propensity for guessing.
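
As an illustration of how d' maps to an ROC curve, the sketch below uses the textbook equal-variance Gaussian observer from signal detection theory, for which p(hit) = Φ(d' + Φ⁻¹(p(fa))). That this particular form matches the ROC used inside TVM is an assumption, and the function name `p_hit` is hypothetical; the guessing term mentioned above is not modeled.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def p_hit(d_prime, p_fa):
    """Hit probability of an equal-variance Gaussian observer.

    The decision criterion is placed so that noise alone exceeds it with
    probability p_fa; the signal distribution is shifted by d_prime.
    """
    if not 0.0 < p_fa < 1.0:
        raise ValueError("p_fa must lie strictly between 0 and 1")
    return _STD_NORMAL.cdf(d_prime + _STD_NORMAL.inv_cdf(p_fa))


if __name__ == "__main__":
    # Trace a few ROC points for two hypothetical d' values.
    for d in (1.0, 2.0):
        print(d, [round(p_hit(d, fa), 3) for fa in (0.01, 0.05, 0.1)])
```

With d' = 0 the curve collapses onto the chance diagonal, p(hit) = p(fa), and larger d' values push it toward the upper-left corner.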

The detectability measure, d', obtained from TVM is proportional to the SNR between the vehicle
of interest and the background (as explained in Fig. 2). Next, in Figure 4, we plot the highest SNR of all
frequency channels as a function of the rain rate. The curve in Figure 4 with the higher whole-image SNR
is that of the uncooled pyroelectric FPA.


Figure 4: Image SNR vs Rain Rate


Figure 5: Conspicuity vs Rain Rate

Figure 5 shows the predicted conspicuities of the vehicles when viewed through the sensors and
atmosphere. The two curves in Figure 5 are the detectabilities of the target vehicle as predicted by the
visual model. Predictions based on different metrics for background clutter will be the subject of a future
paper. For this particular case, the target has the higher predicted conspicuity when seen through the
uncooled 7.5-13.5 band. In Figure 6, the computer-simulated images are shown for the
uncooled and cooled cameras. The top row is the clear case with and without the object of interest, which
is the car at the center of the picture. The range for all the pictures is 70 meters. The second row is for
the case of fog. As one goes down the columns of images, the rain rate is 1, 12.5, 25, 37.5 and 50 mm/hr
respectively. The images show that the longwave uncooled camera provides a higher contrast picture
under all conditions. Given two simulated infrared sensor images from TTIM of the same scene, an object
of interest, and the background, we use TVM to compute an SNR for the whole image and a measure of
detectability d' for the object of interest in each of the images. The image with the higher SNR has
greater contrast and is easier to interpret. The object of interest in a scene with the higher d' has a higher
conspicuity and is therefore easier to see.


Figure 6: Uncooled 7.5-13.5 vs cooled 3.4-5.5 camera


5.  CONCLUSIONS 


In this paper we provided a simulation of and a comparison between cooled and uncooled
infrared imaging systems. This was done with a view towards using such systems for automotive collision
avoidance applications. Using TTIM, we successfully simulated both infrared imaging systems. We
provided simulated images as seen through these sensors when the viewing distance is constant and
the amount of rainfall under which the images are acquired increases. In a previous paper [10] the
contrast scaling was based on a pooling of all the images for both sensors. In this paper, the authors
scaled the contrast manually per sensor type. This gave results that are in better agreement with field
data and sensor performance. The 7.5-13.5 band has more background radiance in the scenes, which tends
to add more grey to the image as rain rate increases, whereas the 3.4-5.5 band gets greyer with increasing
rain rate primarily because of radiance loss due to scattering. These model predictions are consistent with
infrared field images of test patterns, through both bands, in the rain. Scattering losses are compounded
by the shape of the Planck blackbody exitance distribution. The shape of the blackbody curve at a
temperature of 300K shows that the 7.5-13.5 band has almost a factor of 2 more exitance. Using TVM,
we compared the two sensors. In each of the spatial frequency channels found in early vision among
humans, we obtained a measure of detectability for an object and background of interest.
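
For readers who want to explore the blackbody argument numerically, the sketch below integrates Planck's spectral exitance over each band with a simple trapezoidal rule. It is an idealized calculation under stated assumptions: a perfect blackbody at 300 K, sharp band edges at 3.4-5.5 and 7.5-13.5 microns, and no optics, detector response or atmosphere. The function names are hypothetical, and the ratio it prints should not be expected to reproduce a figure that folds in those additional effects.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light in vacuum, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K


def spectral_exitance(wavelength_m, temperature_k):
    """Planck spectral exitance of a blackbody, in W per m^2 per m of wavelength."""
    prefactor = 2.0 * math.pi * H * C ** 2 / wavelength_m ** 5
    exponent = H * C / (wavelength_m * K_B * temperature_k)
    return prefactor / math.expm1(exponent)


def band_exitance(lo_um, hi_um, temperature_k, steps=2000):
    """Trapezoidal integral of the spectral exitance over a band, in W/m^2."""
    lo, hi = lo_um * 1e-6, hi_um * 1e-6
    dx = (hi - lo) / steps
    total = 0.5 * (spectral_exitance(lo, temperature_k)
                   + spectral_exitance(hi, temperature_k))
    for i in range(1, steps):
        total += spectral_exitance(lo + i * dx, temperature_k)
    return total * dx


if __name__ == "__main__":
    longwave = band_exitance(7.5, 13.5, 300.0)
    midwave = band_exitance(3.4, 5.5, 300.0)
    print(f"7.5-13.5 band: {longwave:.1f} W/m^2")
    print(f"3.4-5.5 band:  {midwave:.1f} W/m^2")
    print(f"ratio: {longwave / midwave:.1f}")
```

Changing the temperature argument shows how quickly the balance between the two bands shifts for hotter sources such as exhaust components.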

We plotted the SNR versus rain rate for both sensors, and obtained the variation in the SNR as
the amount of rainfall under which the images are acquired increases. Based on the computer
simulations the authors performed, our suggestion for a commercial infrared unit for use in a collision
avoidance or vision enhancement system is as follows. Since (1) the 7.5-13.5 band has more exitance
than the 3.4-5.5 band, (2) the transmittance in rain is nearly a factor of 1.5 better in the 7.5-13.5 band,
and (3) the uncooled imagery was excellent in quality, we suggest using the 7.5-13.5
band, uncooled pyroelectric sensor. In addition, the unit used for data collection was in fact several years
old, and there has since been a 50% increase in the detector sensitivity along with improvements in the
detector uniformity and system implementation.

Sensor comparisons are one aspect of collision avoidance and vision enhancement. There are also a
number of human factors and social issues associated with the "science of collision
avoidance", as pointed out in [2].